cmd/k8s-operator: support custom TLS Secrets on Ingress #2

Draft
ryantm wants to merge 14 commits into main from zerg/operator-wildcard-cert

ryantm commented Mar 25, 2026

Why

We want the Kubernetes operator to serve custom HTTPS hostnames like zerg.zergrush.dev while preserving the existing MagicDNS/Tailscale hostname path and the Tailscale-* identity model for ingress backends. That lets zergrush move web traffic onto operator-managed custom TLS without keeping a long-lived custom gateway in zergrush itself.

What changed

  • cherry-pick upstream PR tailscale/tailscale#18636 so tailscale.com/accept-app-caps works on both standard and ProxyGroup-backed Ingress resources
  • add support for using Ingress.spec.tls[0].secretName as a custom TLS certificate source for operator-managed ingress, copying the custom cert into proxy state secrets so the proxy can terminate that hostname directly
  • preserve the existing MagicDNS hostname alongside the custom TLS hostname by adding both HTTPS hosts to the serve config, so zerg.tail0a469.ts.net keeps working while zerg.zergrush.dev is introduced
  • teach ipnlocal VIP service routing to match exact custom HTTPS hostnames before falling back to the service's MagicDNS FQDN
  • add TLS Secret watch/index plumbing plus unit tests for standard ingress, ProxyGroup ingress, and custom service-host routing
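
The host-matching order described in the routing bullet above can be sketched roughly as follows. This is a minimal illustration, not the code in ipnlocal: the `vipService` type, its fields, and `serviceForHost` are hypothetical names invented for this example, and the sketch assumes hostnames are compared case-insensitively with any trailing dot ignored.

```go
package main

import (
	"fmt"
	"strings"
)

// vipService is a hypothetical, simplified stand-in for the per-service
// state; the real ipnlocal types differ.
type vipService struct {
	Name        string   // e.g. "svc:zerg"
	MagicDNS    string   // MagicDNS FQDN, e.g. "zerg.tail0a469.ts.net"
	CustomHosts []string // exact custom HTTPS hostnames from the serve config
}

// normalizeHost lowercases the host and drops a trailing dot (an assumption
// of this sketch, not a documented behavior).
func normalizeHost(h string) string {
	return strings.ToLower(strings.TrimSuffix(h, "."))
}

// serviceForHost checks exact custom hostnames first and only then falls
// back to each service's MagicDNS FQDN.
func serviceForHost(services []vipService, host string) (vipService, bool) {
	host = normalizeHost(host)
	for _, s := range services {
		for _, h := range s.CustomHosts {
			if normalizeHost(h) == host {
				return s, true
			}
		}
	}
	for _, s := range services {
		if normalizeHost(s.MagicDNS) == host {
			return s, true
		}
	}
	return vipService{}, false
}

func main() {
	services := []vipService{{
		Name:        "svc:zerg",
		MagicDNS:    "zerg.tail0a469.ts.net",
		CustomHosts: []string{"zerg.zergrush.dev"},
	}}
	for _, h := range []string{"zerg.zergrush.dev", "zerg.tail0a469.ts.net", "other.example.com"} {
		s, ok := serviceForHost(services, h)
		fmt.Println(h, s.Name, ok)
	}
}
```

Checking custom hosts in a separate first pass keeps the existing MagicDNS path untouched for Ingresses that have no custom TLS host.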

Test plan

  • ./tool/go fmt ./cmd/k8s-operator ./ipn/ipnlocal
  • ./tool/go test ./cmd/k8s-operator ./ipn/ipnlocal
  • ./tool/go test ./cmd/k8s-operator/...

Revertibility

Safe to revert. The changes are limited to operator/runtime ingress behavior and tests; reverting restores the previous Tailscale-managed certificate behavior for ingress resources.

~ written by Zerg 👾

matthalp and others added 3 commits March 25, 2026 12:06
Add support for the tailscale.com/accept-app-caps annotation on Ingress
resources. This populates the AcceptAppCaps field on HTTPHandler entries
in the serve config, which causes the serve proxy to forward matching
peer capabilities in the Tailscale-App-Capabilities header to backends.

The annotation accepts a comma-separated list of capability names
(e.g. "example.com/cap/monitoring,example.com/cap/admin"). Each
capability is validated against the standard app capability regex.
Invalid capabilities are skipped with a warning event, consistent
with the operator's soft-validation pattern.

Both the standard Ingress reconciler and the HA (ProxyGroup) Ingress
reconciler benefit from this change since they share the same
handlersForIngress() function.
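
As a rough sketch of the soft-validation behaviour described above, the annotation value could be split and filtered like this. The regular expression is an illustrative stand-in, not the operator's actual app capability regex, and `parseAcceptAppCaps` is a hypothetical helper name. Trimming whitespace around entries is also an assumption of the sketch.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// appCapPattern is an illustrative stand-in for "domain/path" capability
// names. It is NOT the regex the operator uses.
var appCapPattern = regexp.MustCompile(`^[a-z0-9-]+(\.[a-z0-9-]+)+/[A-Za-z0-9/_.-]+$`)

// parseAcceptAppCaps splits a comma-separated annotation value and returns
// the capabilities that pass validation, plus the ones that were skipped so
// the caller can emit a warning event for each.
func parseAcceptAppCaps(annotation string) (valid, skipped []string) {
	for _, raw := range strings.Split(annotation, ",") {
		c := strings.TrimSpace(raw)
		if c == "" {
			continue // ignore empty entries such as a trailing comma
		}
		if appCapPattern.MatchString(c) {
			valid = append(valid, c)
		} else {
			skipped = append(skipped, c)
		}
	}
	return valid, skipped
}

func main() {
	valid, skipped := parseAcceptAppCaps("example.com/cap/monitoring, not a cap,example.com/cap/admin")
	fmt.Println("valid:", valid)
	fmt.Println("skipped:", skipped)
}
```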

Updates #tailscale/corp#28049

Signed-off-by: matthalp <mhalpern@column.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ryantm force-pushed the zerg/operator-wildcard-cert branch from e8dd7a2 to 0cdbe23 on March 25, 2026 22:06
ryantm and others added 11 commits March 25, 2026 16:08
When multiple environments share the same custom domain structure
(e.g., zerg.staging.zergrush.dev, zerg.testing.zergrush.dev), the
hostnameForIngress function derives the same service name (svc:zerg)
for all of them because it only reads the first DNS label of the TLS
host.

Add a tailscale.com/service-name annotation that takes precedence
over the TLS host for service name derivation. This lets operators
explicitly control the Tailscale Service name per Ingress:

  annotations:
    tailscale.com/proxy-group: zerg-west1-staging-ingress
    tailscale.com/service-name: zerg-west1-staging

The annotation is optional — without it, existing behavior is
unchanged (service name derived from first DNS label of TLS host).
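
The precedence rule above can be illustrated with a small sketch. `serviceNameForIngress` is a hypothetical helper written for this example (the commit message names `hostnameForIngress` but does not show its signature); the `svc:` prefix and the annotation key come from the text above.

```go
package main

import (
	"fmt"
	"strings"
)

const serviceNameAnnotation = "tailscale.com/service-name"

// serviceNameForIngress prefers the explicit annotation and otherwise falls
// back to the first DNS label of the TLS host.
func serviceNameForIngress(annotations map[string]string, tlsHost string) string {
	if name := strings.TrimSpace(annotations[serviceNameAnnotation]); name != "" {
		return "svc:" + name
	}
	label, _, _ := strings.Cut(tlsHost, ".")
	return "svc:" + label
}

func main() {
	// Without the annotation, both environments collide on the same name.
	fmt.Println(serviceNameForIngress(nil, "zerg.staging.zergrush.dev"))
	fmt.Println(serviceNameForIngress(nil, "zerg.testing.zergrush.dev"))

	// With the annotation, the name is controlled per Ingress.
	ann := map[string]string{serviceNameAnnotation: "zerg-west1-staging"}
	fmt.Println(serviceNameForIngress(ann, "zerg.staging.zergrush.dev"))
}
```
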
Relax the Ingress TLS validation to allow multiple hosts within one
TLS entry. The service name is derived from the first host unless the
tailscale.com/service-name annotation overrides it. Additional hosts
are served using the same custom TLS cert from secretName.

This is needed for serving both the primary domain and apex domain
(e.g., zerg.zergrush.dev + zergrush.dev) with the same wildcard cert.

Only one TLS entry is still enforced (multiple entries are rejected).
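
A minimal sketch of the relaxed rule, using a simplified stand-in for the Kubernetes `IngressTLS` type so the example stays self-contained. `ingressTLS` and `validateIngressTLS` are hypothetical names, and the real validation likely checks more than this.

```go
package main

import "fmt"

// ingressTLS is a simplified stand-in for the Kubernetes IngressTLS type.
type ingressTLS struct {
	Hosts      []string
	SecretName string
}

// validateIngressTLS rejects more than one TLS entry but allows any number
// of hosts inside the single entry.
func validateIngressTLS(entries []ingressTLS) error {
	if len(entries) > 1 {
		return fmt.Errorf("expected at most one TLS entry, got %d", len(entries))
	}
	return nil
}

func main() {
	// Hypothetical Secret name, for illustration only.
	ok := []ingressTLS{{
		Hosts:      []string{"zerg.zergrush.dev", "zergrush.dev"},
		SecretName: "wildcard-cert",
	}}
	fmt.Println(validateIngressTLS(ok))

	tooMany := append(ok, ingressTLS{Hosts: []string{"other.zergrush.dev"}})
	fmt.Println(validateIngressTLS(tooMany))
}
```
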
…TLS entry

Previously, when an Ingress had multiple hosts in a single TLS entry
(e.g. hosts: ["zerg.zergrush.dev", "zergrush.dev"]), only the first
host got the custom cert registered. Additional hosts failed with a TLS
internal error because the proxy couldn't find a cached cert for them.

Fix by changing ingressCustomTLS.host (string) to hosts ([]string) and
propagating all hosts through the cert registration pipeline:
- copyCustomTLSSecretData writes cert/key data for every host
- ingressHTTPSHosts includes all custom hosts in serve config
- CustomTLSCerts map is populated for all hosts
- handlersForIngress accepts rules matching any TLS host
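
The per-host propagation might look roughly like the sketch below. `customTLSSecretData` is a hypothetical helper, not `copyCustomTLSSecretData` itself, and the `<host>.crt` / `<host>.key` key naming is an assumption made for illustration; the actual key format in the state Secret may differ.

```go
package main

import "fmt"

// customTLSSecretData returns Secret data entries carrying the same cert and
// key for every host in the TLS entry, so that no host is left without a
// cached cert. The key naming scheme here is assumed, not confirmed.
func customTLSSecretData(hosts []string, certPEM, keyPEM []byte) map[string][]byte {
	data := make(map[string][]byte, 2*len(hosts))
	for _, h := range hosts {
		data[h+".crt"] = certPEM
		data[h+".key"] = keyPEM
	}
	return data
}

func main() {
	hosts := []string{"zerg.zergrush.dev", "zergrush.dev"}
	data := customTLSSecretData(hosts, []byte("CERT"), []byte("KEY"))
	fmt.Println(len(data), "entries written for", len(hosts), "hosts")
}
```
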
…ts.net domains

When a user provides a custom TLS cert via Kubernetes Ingress spec.tls,
the proxy should serve it regardless of x509 chain or domain validation
results. Previously, validCertPEM would reject certs that failed
verification (e.g., wildcard cert not covering the apex domain,
self-signed cert, private CA), causing TLS internal errors for
additional hosts in multi-host TLS entries.

Add ReadRaw to certStore that reads cert/key checking only parseability
(tls.X509KeyPair), not x509 chain/domain validity. In
GetCertPEMWithValidity, when a domain is definitively not ACME-managed
(errNotACMEManaged) and doesn't end in .ts.net, fall back to ReadRaw
when getCertPEMCached returns errCertExpired.
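
The fallback condition can be sketched as below. The error values are local stand-ins that borrow the names from the commit message, and `shouldReadRaw` and `parseableKeyPair` are hypothetical helpers. Only `tls.X509KeyPair` is a real standard-library call. The sketch assumes the `.ts.net` check is a case-insensitive suffix match.

```go
package main

import (
	"crypto/tls"
	"errors"
	"fmt"
	"strings"
)

// Local stand-ins for the errors named in the commit message.
var (
	errNotACMEManaged = errors.New("domain is not ACME-managed")
	errCertExpired    = errors.New("cached cert failed validation")
)

// parseableKeyPair checks only that the cert and key parse and match.
// It does no x509 chain or domain verification.
func parseableKeyPair(certPEM, keyPEM []byte) error {
	_, err := tls.X509KeyPair(certPEM, keyPEM)
	return err
}

// shouldReadRaw reports whether the raw (unverified) read should be tried:
// the domain is not under .ts.net, it is definitively not ACME-managed, and
// the cached lookup rejected the cert.
func shouldReadRaw(domain string, acmeErr, cacheErr error) bool {
	if strings.HasSuffix(strings.ToLower(domain), ".ts.net") {
		return false
	}
	return errors.Is(acmeErr, errNotACMEManaged) && errors.Is(cacheErr, errCertExpired)
}

func main() {
	fmt.Println(shouldReadRaw("zergrush.dev", errNotACMEManaged, errCertExpired))
	fmt.Println(shouldReadRaw("zerg.tail0a469.ts.net", errNotACMEManaged, errCertExpired))
	fmt.Println(parseableKeyPair([]byte("not pem"), []byte("not pem")) != nil)
}
```

Keeping the parse check means a malformed Secret still fails early, while a self-signed or private-CA cert is served as the user supplied it.
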
…CME certs

For custom (non-ts.net) domains, check the pod's state Secret before
the domain-specific Secret when looking up TLS certs. Custom TLS certs
from Ingress spec.tls entries are written to the state Secret by the
operator, while ACME-provisioned certs are stored in domain-specific
Secrets. Without this ordering, an ACME cert for the same domain would
shadow the user-provided custom cert.

For ts.net domains, the lookup order is unchanged (domain-specific
Secret first) to avoid extra API calls on the common ACME path.
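
The ordering rule reads naturally as a small function. This is a sketch under stated assumptions: `certSecretLookupOrder` and the Secret names in `main` are hypothetical, and the sketch assumes the ts.net path also consults the state Secret second.

```go
package main

import (
	"fmt"
	"strings"
)

// certSecretLookupOrder returns the Secrets to consult, in order, when
// looking up a TLS cert for domain.
func certSecretLookupOrder(domain, stateSecret, domainSecret string) []string {
	if strings.HasSuffix(strings.ToLower(domain), ".ts.net") {
		// Common ACME path: unchanged, domain-specific Secret first.
		return []string{domainSecret, stateSecret}
	}
	// Custom domain: the operator-written state Secret wins, so an ACME
	// cert for the same domain cannot shadow the user-provided cert.
	return []string{stateSecret, domainSecret}
}

func main() {
	// Hypothetical Secret names, for illustration only.
	fmt.Println(certSecretLookupOrder("zerg.zergrush.dev", "proxy-0", "zerg-zergrush-dev"))
	fmt.Println(certSecretLookupOrder("zerg.tail0a469.ts.net", "proxy-0", "zerg-ts-net"))
}
```
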
3 participants